Overview

Dataset statistics

Number of variables: 14
Number of observations: 4000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 437.6 KiB
Average record size in memory: 112.0 B

Variable types

Numeric: 5
Categorical: 8
DateTime: 1

Warnings

days_since_last_call is highly correlated with num_contacts_prev (High correlation)
num_contacts_prev is highly correlated with days_since_last_call (High correlation)
prev_call_duration is highly correlated with subs_deposit (High correlation)
subs_deposit is highly correlated with prev_call_duration (High correlation)
has_housing_loan is highly correlated with has_personal_loan (High correlation)
has_personal_loan is highly correlated with has_housing_loan (High correlation)
marital is highly correlated with age_bracket (High correlation)
age_bracket is highly correlated with marital (High correlation)
cpi is highly correlated with contact_date (High correlation)
client_id is highly correlated with contact_date (High correlation)
job is highly correlated with education (High correlation)
education is highly correlated with job (High correlation)
contact_date is highly correlated with cpi and 1 other field (High correlation)
poutcome is highly correlated with num_contacts_prev and 1 other field (High correlation)
num_contacts_prev is highly correlated with poutcome and 1 other field (High correlation)
days_since_last_call is highly correlated with poutcome and 1 other field (High correlation)
client_id has unique values (Unique)
num_contacts_prev has 3219 (80.5%) zeros (Zeros)

Reproduction

Analysis started: 2022-04-19 05:12:09.545437
Analysis finished: 2022-04-19 05:15:44.309119
Duration: 3 minutes and 34.76 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

client_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 4000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22430.64275
Minimum: 17
Maximum: 41186
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.4 KiB

Quantile statistics

Minimum: 17
5th percentile: 2543.9
Q1: 12408.25
Median: 23336.5
Q3: 32990
95th percentile: 39701.3
Maximum: 41186
Range: 41169
Interquartile range (IQR): 20581.75

Descriptive statistics

Standard deviation: 12052.91754
Coefficient of variation (CV): 0.5373416034
Kurtosis: -1.20571
Mean: 22430.64275
Median Absolute Deviation (MAD): 10169
Skewness: -0.1650511913
Sum: 89722571
Variance: 145272821.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
41020 | 1 | < 0.1%
1527 | 1 | < 0.1%
22819 | 1 | < 0.1%
27600 | 1 | < 0.1%
37245 | 1 | < 0.1%
22628 | 1 | < 0.1%
28759 | 1 | < 0.1%
33035 | 1 | < 0.1%
22012 | 1 | < 0.1%
35763 | 1 | < 0.1%
Other values (3990) | 3990 | 99.8%

Smallest values

Value | Count | Frequency (%)
17 | 1 | < 0.1%
53 | 1 | < 0.1%
68 | 1 | < 0.1%
88 | 1 | < 0.1%
101 | 1 | < 0.1%
115 | 1 | < 0.1%
139 | 1 | < 0.1%
180 | 1 | < 0.1%
192 | 1 | < 0.1%
238 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
41186 | 1 | < 0.1%
41185 | 1 | < 0.1%
41181 | 1 | < 0.1%
41167 | 1 | < 0.1%
41163 | 1 | < 0.1%
41148 | 1 | < 0.1%
41133 | 1 | < 0.1%
41131 | 1 | < 0.1%
41118 | 1 | < 0.1%
41103 | 1 | < 0.1%

age_bracket
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
25-40
2161 
41-60
1544 
18-24
 
148
60+
 
147

Length

Max length5
Median length5
Mean length4.9265
Min length3

Characters and Unicode

Total characters19706
Distinct characters9
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row41-60
2nd row60+
3rd row41-60
4th row25-40
5th row18-24

Common Values

ValueCountFrequency (%)
25-402161
54.0%
41-601544
38.6%
18-24148
 
3.7%
60+147
 
3.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
25-402161
54.0%
41-601544
38.6%
18-24148
 
3.7%
60+147
 
3.7%

Most occurring characters

ValueCountFrequency (%)
43853
19.6%
-3853
19.6%
03852
19.5%
22309
11.7%
52161
11.0%
11692
8.6%
61691
8.6%
8148
 
0.8%
+147
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number15706
79.7%
Dash Punctuation3853
 
19.6%
Math Symbol147
 
0.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
43853
24.5%
03852
24.5%
22309
14.7%
52161
13.8%
11692
10.8%
61691
10.8%
8148
 
0.9%
Dash Punctuation
ValueCountFrequency (%)
-3853
100.0%
Math Symbol
ValueCountFrequency (%)
+147
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common19706
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
43853
19.6%
-3853
19.6%
03852
19.5%
22309
11.7%
52161
11.0%
11692
8.6%
61691
8.6%
8148
 
0.8%
+147
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII19706
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
43853
19.6%
-3853
19.6%
03852
19.5%
22309
11.7%
52161
11.0%
11692
8.6%
61691
8.6%
8148
 
0.8%
+147
 
0.7%

job
Categorical

HIGH CORRELATION

Distinct7
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
white-collar
1366 
blue-collar
769 
technician
640 
other
503 
pink-collar
455 
Other values (2)
267 

Length

Max length13
Median length11
Mean length10.532
Min length5

Characters and Unicode

Total characters42128
Distinct characters21
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowwhite-collar
2nd rowother
3rd rowwhite-collar
4th rowtechnician
5th rowwhite-collar

Common Values

ValueCountFrequency (%)
white-collar1366
34.2%
blue-collar769
19.2%
technician640
16.0%
other503
 
12.6%
pink-collar455
 
11.4%
self-employed153
 
3.8%
entrepreneur114
 
2.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
white-collar1366
34.2%
blue-collar769
19.2%
technician640
16.0%
other503
 
12.6%
pink-collar455
 
11.4%
self-employed153
 
3.8%
entrepreneur114
 
2.9%

Most occurring characters

ValueCountFrequency (%)
l6255
14.8%
e4193
10.0%
c3870
9.2%
r3435
8.2%
o3246
7.7%
a3230
7.7%
i3101
7.4%
-2743
 
6.5%
t2623
 
6.2%
h2509
 
6.0%
Other values (11)6923
16.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter39385
93.5%
Dash Punctuation2743
 
6.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l6255
15.9%
e4193
10.6%
c3870
9.8%
r3435
8.7%
o3246
8.2%
a3230
8.2%
i3101
7.9%
t2623
6.7%
h2509
6.4%
n1963
 
5.0%
Other values (10)4960
12.6%
Dash Punctuation
ValueCountFrequency (%)
-2743
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin39385
93.5%
Common2743
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l6255
15.9%
e4193
10.6%
c3870
9.8%
r3435
8.7%
o3246
8.2%
a3230
8.2%
i3101
7.9%
t2623
6.7%
h2509
6.4%
n1963
 
5.0%
Other values (10)4960
12.6%
Common
ValueCountFrequency (%)
-2743
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII42128
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l6255
14.8%
e4193
10.0%
c3870
9.2%
r3435
8.2%
o3246
7.7%
a3230
7.7%
i3101
7.4%
-2743
 
6.5%
t2623
 
6.2%
h2509
 
6.0%
Other values (11)6923
16.4%

marital
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
married
2374 
single
1176 
divorced
442 
unknown
 
8

Length

Max length8
Median length7
Mean length6.8165
Min length6

Characters and Unicode

Total characters27266
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowdivorced
2nd rowdivorced
3rd rowmarried
4th rowsingle
5th rowsingle

Common Values

ValueCountFrequency (%)
married2374
59.4%
single1176
29.4%
divorced442
 
11.1%
unknown8
 
0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
married2374
59.4%
single1176
29.4%
divorced442
 
11.1%
unknown8
 
0.2%

Most occurring characters

ValueCountFrequency (%)
r5190
19.0%
i3992
14.6%
e3992
14.6%
d3258
11.9%
m2374
8.7%
a2374
8.7%
n1200
 
4.4%
s1176
 
4.3%
g1176
 
4.3%
l1176
 
4.3%
Other values (6)1358
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter27266
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r5190
19.0%
i3992
14.6%
e3992
14.6%
d3258
11.9%
m2374
8.7%
a2374
8.7%
n1200
 
4.4%
s1176
 
4.3%
g1176
 
4.3%
l1176
 
4.3%
Other values (6)1358
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Latin27266
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r5190
19.0%
i3992
14.6%
e3992
14.6%
d3258
11.9%
m2374
8.7%
a2374
8.7%
n1200
 
4.4%
s1176
 
4.3%
g1176
 
4.3%
l1176
 
4.3%
Other values (6)1358
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII27266
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r5190
19.0%
i3992
14.6%
e3992
14.6%
d3258
11.9%
m2374
8.7%
a2374
8.7%
n1200
 
4.4%
s1176
 
4.3%
g1176
 
4.3%
l1176
 
4.3%
Other values (6)1358
 
5.0%

education
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
bachelors
1274 
secondary
1114 
senior_secondary
908 
masters
524 
unknown
176 

Length

Max length16
Median length9
Mean length10.24
Min length7

Characters and Unicode

Total characters40960
Distinct characters19
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowbachelors
2nd rowsecondary
3rd rowbachelors
4th rowsenior_secondary
5th rowbachelors

Common Values

ValueCountFrequency (%)
bachelors1274
31.9%
secondary1114
27.9%
senior_secondary908
22.7%
masters524
13.1%
unknown176
 
4.4%
illiterate4
 
0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
bachelors1274
31.9%
secondary1114
27.9%
senior_secondary908
22.7%
masters524
13.1%
unknown176
 
4.4%
illiterate4
 
0.1%

Most occurring characters

ValueCountFrequency (%)
s5252
12.8%
e4736
11.6%
r4732
11.6%
o4380
10.7%
a3824
9.3%
n3458
8.4%
c3296
8.0%
d2022
 
4.9%
y2022
 
4.9%
l1282
 
3.1%
Other values (9)5956
14.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter40052
97.8%
Connector Punctuation908
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s5252
13.1%
e4736
11.8%
r4732
11.8%
o4380
10.9%
a3824
9.5%
n3458
8.6%
c3296
8.2%
d2022
 
5.0%
y2022
 
5.0%
l1282
 
3.2%
Other values (8)5048
12.6%
Connector Punctuation
ValueCountFrequency (%)
_908
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin40052
97.8%
Common908
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
s5252
13.1%
e4736
11.8%
r4732
11.8%
o4380
10.9%
a3824
9.5%
n3458
8.6%
c3296
8.2%
d2022
 
5.0%
y2022
 
5.0%
l1282
 
3.2%
Other values (8)5048
12.6%
Common
ValueCountFrequency (%)
_908
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII40960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s5252
12.8%
e4736
11.6%
r4732
11.6%
o4380
10.7%
a3824
9.3%
n3458
8.4%
c3296
8.0%
d2022
 
4.9%
y2022
 
4.9%
l1282
 
3.1%
Other values (9)5956
14.5%

has_housing_loan
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
yes
2115 
no
1793 
unknown
 
92

Length

Max length7
Median length3
Mean length2.64375
Min length2

Characters and Unicode

Total characters10575
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowyes
2nd rowno
3rd rowno
4th rowyes
5th rowno

Common Values

ValueCountFrequency (%)
yes2115
52.9%
no1793
44.8%
unknown92
 
2.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
yes2115
52.9%
no1793
44.8%
unknown92
 
2.3%

Most occurring characters

ValueCountFrequency (%)
y2115
20.0%
e2115
20.0%
s2115
20.0%
n2069
19.6%
o1885
17.8%
u92
 
0.9%
k92
 
0.9%
w92
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter10575
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y2115
20.0%
e2115
20.0%
s2115
20.0%
n2069
19.6%
o1885
17.8%
u92
 
0.9%
k92
 
0.9%
w92
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin10575
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
y2115
20.0%
e2115
20.0%
s2115
20.0%
n2069
19.6%
o1885
17.8%
u92
 
0.9%
k92
 
0.9%
w92
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII10575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y2115
20.0%
e2115
20.0%
s2115
20.0%
n2069
19.6%
o1885
17.8%
u92
 
0.9%
k92
 
0.9%
w92
 
0.9%

has_personal_loan
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
no
3335 
yes
573 
unknown
 
92

Length

Max length7
Median length2
Mean length2.25825
Min length2

Characters and Unicode

Total characters9033
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowno
2nd rowyes
3rd rowno
4th rowyes
5th rowno

Common Values

ValueCountFrequency (%)
no3335
83.4%
yes573
 
14.3%
unknown92
 
2.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
no3335
83.4%
yes573
 
14.3%
unknown92
 
2.3%

Most occurring characters

ValueCountFrequency (%)
n3611
40.0%
o3427
37.9%
y573
 
6.3%
e573
 
6.3%
s573
 
6.3%
u92
 
1.0%
k92
 
1.0%
w92
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter9033
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n3611
40.0%
o3427
37.9%
y573
 
6.3%
e573
 
6.3%
s573
 
6.3%
u92
 
1.0%
k92
 
1.0%
w92
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin9033
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n3611
40.0%
o3427
37.9%
y573
 
6.3%
e573
 
6.3%
s573
 
6.3%
u92
 
1.0%
k92
 
1.0%
w92
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII9033
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n3611
40.0%
o3427
37.9%
y573
 
6.3%
e573
 
6.3%
s573
 
6.3%
u92
 
1.0%
k92
 
1.0%
w92
 
1.0%

prev_call_duration
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1032
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3871.14225
Minimum: 2
Maximum: 419900
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.4 KiB

Quantile statistics

Minimum: 2
5th percentile: 50
Q1: 131
Median: 237
Q3: 461
95th percentile: 1030.1
Maximum: 419900
Range: 419898
Interquartile range (IQR): 330

Descriptive statistics

Standard deviation: 26080.54905
Coefficient of variation (CV): 6.737171452
Kurtosis: 73.05096721
Mean: 3871.14225
Median Absolute Deviation (MAD): 133
Skewness: 8.09394563
Sum: 15484569
Variance: 680195038.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
72 | 20 | 0.5%
159 | 19 | 0.5%
165 | 19 | 0.5%
131 | 19 | 0.5%
164 | 19 | 0.5%
157 | 18 | 0.4%
76 | 17 | 0.4%
156 | 17 | 0.4%
161 | 16 | 0.4%
187 | 16 | 0.4%
Other values (1022) | 3820 | 95.5%

Smallest values

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 4 | 0.1%
5 | 2 | 0.1%
6 | 6 | 0.1%
7 | 5 | 0.1%
8 | 3 | 0.1%
9 | 1 | < 0.1%
10 | 4 | 0.1%
11 | 3 | 0.1%

Largest values

Value | Count | Frequency (%)
419900 | 1 | < 0.1%
364300 | 1 | < 0.1%
309400 | 1 | < 0.1%
292600 | 1 | < 0.1%
276900 | 1 | < 0.1%
268000 | 1 | < 0.1%
265300 | 1 | < 0.1%
248600 | 1 | < 0.1%
246200 | 1 | < 0.1%
245600 | 1 | < 0.1%

days_since_last_call
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 24
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 903.15075
Minimum: 0
Maximum: 999
Zeros: 5
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 31.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 6
Q1: 999
Median: 999
Q3: 999
95th percentile: 999
Maximum: 999
Range: 999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 293.323535
Coefficient of variation (CV): 0.3247780451
Kurtosis: 5.478469293
Mean: 903.15075
Median Absolute Deviation (MAD): 0
Skewness: -2.734145111
Sum: 3612603
Variance: 86038.6962
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
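
Common values

Value | Count | Frequency (%)
999 | 3614 | 90.3%
3 | 119 | 3.0%
6 | 118 | 2.9%
4 | 27 | 0.7%
5 | 16 | 0.4%
2 | 14 | 0.4%
10 | 13 | 0.3%
7 | 12 | 0.3%
12 | 9 | 0.2%
11 | 9 | 0.2%
Other values (14) | 49 | 1.2%

Smallest values

Value | Count | Frequency (%)
0 | 5 | 0.1%
1 | 6 | 0.1%
2 | 14 | 0.4%
3 | 119 | 3.0%
4 | 27 | 0.7%
5 | 16 | 0.4%
6 | 118 | 2.9%
7 | 12 | 0.3%
8 | 5 | 0.1%
9 | 9 | 0.2%

Largest values

Value | Count | Frequency (%)
999 | 3614 | 90.3%
27 | 1 | < 0.1%
26 | 1 | < 0.1%
25 | 1 | < 0.1%
22 | 1 | < 0.1%
18 | 1 | < 0.1%
17 | 1 | < 0.1%
16 | 3 | 0.1%
15 | 3 | 0.1%
14 | 4 | 0.1%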

num_contacts_prev
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.272
Minimum: 0
Maximum: 6
Zeros: 3219
Zeros (%): 80.5%
Negative: 0
Negative (%): 0.0%
Memory size: 31.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.643132494
Coefficient of variation (CV): 2.364457699
Kurtosis: 11.57486418
Mean: 0.272
Median Absolute Deviation (MAD): 0
Skewness: 3.035775148
Sum: 1088
Variance: 0.4136194049
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
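
Common values

Value | Count | Frequency (%)
0 | 3219 | 80.5%
1 | 567 | 14.2%
2 | 144 | 3.6%
3 | 54 | 1.4%
4 | 10 | 0.2%
5 | 5 | 0.1%
6 | 1 | < 0.1%

Smallest values

Value | Count | Frequency (%)
0 | 3219 | 80.5%
1 | 567 | 14.2%
2 | 144 | 3.6%
3 | 54 | 1.4%
4 | 10 | 0.2%
5 | 5 | 0.1%
6 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
6 | 1 | < 0.1%
5 | 5 | 0.1%
4 | 10 | 0.2%
3 | 54 | 1.4%
2 | 144 | 3.6%
1 | 567 | 14.2%
0 | 3219 | 80.5%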

poutcome
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
nonexistent
3219 
failure
419 
success
362 

Length

Max length11
Median length11
Mean length10.219
Min length7

Characters and Unicode

Total characters40876
Distinct characters13
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowsuccess
2nd rowsuccess
3rd rownonexistent
4th rownonexistent
5th rownonexistent

Common Values

ValueCountFrequency (%)
nonexistent3219
80.5%
failure419
 
10.5%
success362
 
9.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
nonexistent3219
80.5%
failure419
 
10.5%
success362
 
9.0%

Most occurring characters

ValueCountFrequency (%)
n9657
23.6%
e7219
17.7%
t6438
15.8%
s4305
10.5%
i3638
 
8.9%
o3219
 
7.9%
x3219
 
7.9%
u781
 
1.9%
c724
 
1.8%
f419
 
1.0%
Other values (3)1257
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter40876
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n9657
23.6%
e7219
17.7%
t6438
15.8%
s4305
10.5%
i3638
 
8.9%
o3219
 
7.9%
x3219
 
7.9%
u781
 
1.9%
c724
 
1.8%
f419
 
1.0%
Other values (3)1257
 
3.1%

Most occurring scripts

ValueCountFrequency (%)
Latin40876
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n9657
23.6%
e7219
17.7%
t6438
15.8%
s4305
10.5%
i3638
 
8.9%
o3219
 
7.9%
x3219
 
7.9%
u781
 
1.9%
c724
 
1.8%
f419
 
1.0%
Other values (3)1257
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII40876
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n9657
23.6%
e7219
17.7%
t6438
15.8%
s4305
10.5%
i3638
 
8.9%
o3219
 
7.9%
x3219
 
7.9%
u781
 
1.9%
c724
 
1.8%
f419
 
1.0%
Other values (3)1257
 
3.1%

contact_date
Date

HIGH CORRELATION

Distinct: 50
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 31.4 KiB
Minimum: 2018-01-03 00:00:00
Maximum: 2018-07-12 00:00:00
Histogram with fixed size bins (bins=50)

cpi
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 25
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 107.348378
Minimum: 92.201
Maximum: 947.67
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.4 KiB

Quantile statistics

Minimum: 92.201
5th percentile: 92.431
Q1: 92.963
Median: 93.444
Q3: 93.994
95th percentile: 94.465
Maximum: 947.67
Range: 855.469
Interquartile range (IQR): 1.031

Descriptive statistics

Standard deviation: 107.8854692
Coefficient of variation (CV): 1.005003254
Kurtosis: 56.62368257
Mean: 107.348378
Median Absolute Deviation (MAD): 0.55
Skewness: 7.654627887
Sum: 429393.512
Variance: 11639.27446
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)

Common values

Value | Count | Frequency (%)
93.994 | 559 | 14.0%
93.918 | 546 | 13.7%
92.893 | 520 | 13.0%
93.444 | 411 | 10.3%
94.465 | 354 | 8.8%
93.2 | 304 | 7.6%
93.075 | 290 | 7.2%
92.201 | 134 | 3.4%
92.963 | 114 | 2.9%
92.431 | 82 | 2.1%
Other values (15) | 686 | 17.2%

Smallest values

Value | Count | Frequency (%)
92.201 | 134 | 3.4%
92.379 | 45 | 1.1%
92.431 | 82 | 2.1%
92.469 | 36 | 0.9%
92.649 | 67 | 1.7%
92.713 | 31 | 0.8%
92.843 | 50 | 1.2%
92.893 | 520 | 13.0%
92.963 | 114 | 2.9%
93.075 | 290 | 7.2%

Largest values

Value | Count | Frequency (%)
947.67 | 25 | 0.6%
946.01 | 40 | 1.0%
94.465 | 354 | 8.8%
94.215 | 74 | 1.8%
94.199 | 57 | 1.4%
94.055 | 45 | 1.1%
94.027 | 54 | 1.4%
93.994 | 559 | 14.0%
93.918 | 546 | 13.7%
93.876 | 48 | 1.2%

subs_deposit
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.4 KiB
0
2410 
1
1590 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters4000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row0
5th row1

Common Values

ValueCountFrequency (%)
02410
60.2%
11590
39.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
02410
60.2%
11590
39.8%

Most occurring characters

ValueCountFrequency (%)
02410
60.2%
11590
39.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number4000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
02410
60.2%
11590
39.8%

Most occurring scripts

ValueCountFrequency (%)
Common4000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
02410
60.2%
11590
39.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII4000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
02410
60.2%
11590
39.8%

Interactions

[Figures: 25 pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the slope (the angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
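
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code pandas-profiling runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors of covariance and variance cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Toy data (hypothetical): two increasing lines with different slopes and offsets.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_linear = pearson_r(x, [2.0 * v + 1.0 for v in x])
r_steeper = pearson_r(x, [10.0 * v - 7.0 for v in x])
r_negative = pearson_r(x, [-v for v in x])
```

Both increasing lines give the same r regardless of slope or offset, which is the location and scale invariance mentioned above; flipping the sign of the slope flips the sign of r.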

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
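
As a rough illustration of the rank-based calculation, the sketch below ranks each variable (ties receive the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The helper names and the toy data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data (hypothetical): a monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
tied = average_ranks([10, 20, 20, 30])
```

On this toy data ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.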

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
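
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, assuming tied pairs count as neither concordant nor discordant; library implementations often use the tie-adjusted tau-b instead, so results can differ when ties are present. The function name and the toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it contributes to neither count
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (hypothetical): one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.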

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
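
For orientation, here is a small sketch of the bias-corrected estimator as it is commonly stated: the chi-squared statistic is divided by n, a correction term is subtracted from it, and the table dimensions are shrunk slightly before normalising. It assumes a contingency table given as a list of rows of counts; the function name and both toy tables are hypothetical, and the report's own implementation may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or any(t == 0 for t in row_totals + col_totals):
        raise ValueError("need n >= 2 and no empty rows or columns")

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table with a single row or column
    return math.sqrt(phi2_corr / denom)

# Toy tables (hypothetical), not taken from the profiled dataset.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
```

The two toy tables mark the ends of the scale: a perfectly associated table gives a value of 1 (up to rounding), and an exactly independent one gives 0.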

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

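index | client_id | age_bracket | job | marital | education | has_housing_loan | has_personal_loan | prev_call_duration | days_since_last_call | num_contacts_prev | poutcome | contact_date | cpi | subs_deposit
0 | 41020 | 41-60 | white-collar | divorced | bachelors | yes | no | 283 | 3 | 1 | success | 2018-07-09 | 92.379 | 1
1 | 23720 | 60+ | other | divorced | secondary | no | yes | 169 | 6 | 2 | success | 2018-05-07 | 94.215 | 1
2 | 29378 | 41-60 | white-collar | married | bachelors | no | no | 552 | 999 | 0 | nonexistent | 2018-01-08 | 93.444 | 1
3 | 36636 | 25-40 | technician | single | senior_secondary | yes | yes | 206 | 999 | 0 | nonexistent | 2018-02-11 | 93.200 | 0
4 | 38229 | 18-24 | white-collar | single | bachelors | no | no | 341 | 999 | 0 | nonexistent | 2018-04-04 | 93.075 | 1
5 | 27202 | 25-40 | self-employed | married | secondary | no | no | 81 | 999 | 0 | nonexistent | 2018-06-08 | 93.444 | 0
6 | 1409 | 60+ | white-collar | married | bachelors | no | no | 1076 | 6 | 1 | success | 2018-07-05 | 92.893 | 1
7 | 24379 | 41-60 | other | married | senior_secondary | no | no | 133 | 999 | 0 | nonexistent | 2018-06-07 | 93.918 | 0
8 | 10036 | 25-40 | blue-collar | married | secondary | no | no | 253 | 999 | 1 | failure | 2018-03-05 | 92.893 | 0
9 | 18115 | 41-60 | self-employed | married | bachelors | no | no | 467 | 999 | 0 | nonexistent | 2018-01-06 | 94.465 | 1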

Last rows

index | client_id | age_bracket | job | marital | education | has_housing_loan | has_personal_loan | prev_call_duration | days_since_last_call | num_contacts_prev | poutcome | contact_date | cpi | subs_deposit
3990 | 24211 | 25-40 | blue-collar | married | senior_secondary | yes | no | 159 | 999 | 0 | nonexistent | 2018-06-07 | 93.918 | 0
3991 | 28348 | 41-60 | pink-collar | married | secondary | no | no | 77 | 999 | 0 | nonexistent | 2018-07-08 | 93.444 | 0
3992 | 4739 | 25-40 | blue-collar | married | masters | yes | no | 418 | 999 | 0 | nonexistent | 2018-01-05 | 92.893 | 0
3993 | 2389 | 25-40 | blue-collar | divorced | secondary | no | no | 827 | 999 | 0 | nonexistent | 2018-07-05 | 92.893 | 1
3994 | 10248 | 25-40 | technician | single | secondary | yes | no | 13 | 999 | 0 | nonexistent | 2018-03-05 | 92.893 | 0
3995 | 7519 | 41-60 | entrepreneur | single | secondary | yes | no | 396 | 999 | 0 | nonexistent | 2018-02-05 | 92.893 | 1
3996 | 29822 | 41-60 | white-collar | married | bachelors | yes | no | 115 | 999 | 0 | nonexistent | 2018-01-08 | 93.444 | 0
3997 | 24462 | 25-40 | white-collar | married | senior_secondary | yes | no | 214 | 999 | 0 | nonexistent | 2018-06-07 | 93.918 | 0
3998 | 26089 | 25-40 | pink-collar | married | secondary | yes | no | 76 | 999 | 0 | nonexistent | 2018-02-07 | 93.918 | 0
3999 | 40631 | 25-40 | white-collar | single | bachelors | yes | no | 368 | 999 | 0 | nonexistent | 2018-04-09 | 92.379 | 0